Bayesian Neural Word Embedding
Author
Abstract
Recently, several works in the domain of natural language processing have presented successful methods for word embedding. Among them, the Skip-Gram (SG) model with negative sampling, also known as word2vec, advanced the state of the art on various linguistic tasks. In this paper, we propose a scalable Bayesian neural word embedding algorithm that can be beneficial to general item similarity tasks as well. The algorithm relies on a Variational Bayes solution for the SG objective, and a detailed step-by-step description of the algorithm is provided. We present experimental results that demonstrate the performance of the proposed algorithm and show that it is competitive with the original SG method.
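For context, the Skip-Gram negative-sampling objective that the Bayesian treatment targets can be written per word-context pair $(w, c)$ as follows (following Mikolov et al., 2013; the notation here is illustrative and not taken from this paper):

```latex
\log \sigma\!\left(u_c^{\top} v_w\right)
\;+\;
\sum_{i=1}^{k} \mathbb{E}_{c_i \sim P_n(c)}
\left[ \log \sigma\!\left(-u_{c_i}^{\top} v_w\right) \right]
```

where $\sigma$ is the logistic sigmoid, $v_w$ and $u_c$ are the target and context embeddings, and the $k$ negative contexts $c_i$ are drawn from a noise distribution $P_n$.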
Similar Resources
Word Semantic Representations using Bayesian Probabilistic Tensor Factorization
Many forms of word relatedness have been developed, providing different perspectives on word similarity. We introduce a Bayesian probabilistic tensor factorization model for synthesizing a single word vector representation and per-perspective linear transformations from any number of word similarity matrices. The resulting word vectors, when combined with the per-perspective linear transformati...
Word Clustering Using Word Embedding Generated by Neural Net-based Skip Gram
This paper proposes word clustering using word embeddings. We used a neural net-based continuous skip-gram method for generating word embeddings in continuous space. The proposed word clustering method represents each word in the vector space using a neural network. The K-means clustering method partitions the word embeddings into a predetermined K-word...
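As an illustration of the clustering step described in the snippet above (a minimal sketch using plain NumPy, not the cited paper's implementation), K-means over word vectors can be written as:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Minimal K-means: partition the rows of X into k clusters.

    X is an (n, d) array of word embeddings; returns an (n,) array
    of cluster labels in {0, ..., k-1}.
    """
    rng = np.random.default_rng(seed)
    # initialize centers as k distinct random rows of X
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        # assign each vector to its nearest center (Euclidean distance)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        # recompute each center as the mean of its assigned vectors
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# toy "embeddings": two well-separated groups of 4-d vectors
X = np.vstack([np.random.default_rng(1).normal(0.0, 0.1, (5, 4)),
               np.random.default_rng(2).normal(5.0, 0.1, (5, 4))])
labels = kmeans(X, k=2)
```

In practice the embedding matrix would come from a trained skip-gram model; the random vectors here merely stand in for it.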
A New Document Embedding Method for News Classification
Abstract- Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. There is a lot of news spreading on the web. A text classifier can categorize news automatically, and this facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Smart Data: Where the Big Data Meets the Semantics
Big data technology is designed to address the challenges of the three Vs of big data, including volume (massive amount of data), variety (a range of data types and sources), and velocity (speed of data in and out). Big data is often captured without a specific purpose, leading to most of it being task-irrelevant data. The most important feature of data is neither the volume nor the other Vs, b...
Enhancing Translation Language Models with Word Embedding for Information Retrieval
In this paper, we explore the use of word embedding semantic resources for the Information Retrieval (IR) task. These embeddings, produced by a shallow neural network, have been shown to capture semantic similarities between words (Mikolov et al., 2013). Hence, our goal is to enhance IR language models by addressing the term mismatch problem. To do so, we applied the model presented in the paper Inte...
Published: 2017